Genome-wide association and genotype by environment interactions for growth traits in U.S. Red Angus cattle

Background Genotypic information produced from single nucleotide polymorphism (SNP) arrays has routinely been used to identify genomic regions associated with complex traits in beef and dairy cattle. Herein, we assembled a dataset consisting of 15,815 Red Angus beef cattle distributed across the continental U.S. and a union set of 836,118 imputed SNPs to conduct genome-wide association analyses (GWAA) for growth traits using univariate linear mixed models (LMM); including birth weight, weaning weight, and yearling weight. Genomic relationship matrix heritability estimates were produced for all growth traits, and genotype-by-environment (GxE) interactions were investigated. Results Moderate to high heritabilities with small standard errors were estimated for birth weight (0.51 ± 0.01), weaning weight (0.25 ± 0.01), and yearling weight (0.42 ± 0.01). GWAA revealed 12 pleiotropic QTL (BTA6, BTA14, BTA20) influencing Red Angus birth weight, weaning weight, and yearling weight which met a nominal significance threshold (P ≤ 1e-05) for polygenic traits using 836K imputed SNPs. Moreover, positional candidate genes associated with Red Angus growth traits in this study (i.e., LCORL, LOC782905, NCAPG, HERC6, FAM184B, SLIT2, MMRN1, KCNIP4, CCSER1, GRID2, ARRDC3, PLAG1, IMPAD1, NSMAF, PENK, LOC112449660, MOS, SH3PXD2B, STC2, CPEB4) were also previously associated with feed efficiency, growth, and carcass traits in beef cattle. Collectively, 14 significant GxE interactions were also detected, but were less consistent among the investigated traits at a nominal significance threshold (P ≤ 1e-05); with one pleiotropic GxE interaction detected on BTA28 (24 Mb) for Red Angus weaning weight and yearling weight. Conclusions Sixteen well-supported QTL regions detected from the GWAA and GxE GWAA for growth traits (birth weight, weaning weight, yearling weight) in U.S. Red Angus cattle were found to be pleiotropic. Twelve of these pleiotropic QTL were also identified in previous studies focusing on feed efficiency and growth traits in multiple beef breeds and/or their composites. In agreement with other beef cattle GxE studies our results implicate the role of vasodilation, metabolism, and the nervous system in the genetic sensitivity to environmental stress. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-022-08667-6.

yearling weight. However, genomic selection on these traits should consider that low and high estimated breeding values (EBVs) for birth weight have been found to be associated with reduced calf viability, and increased rates of dystocia events and perinatal mortality, respectively [2,3]. Therefore, while birth weight has been considered a production indicator and treated as a selection criterion to increase calf viability as well as other economically important traits, modern beef breeding programs and production systems generally strive to increase calving ease while also maximizing both weaning weight and yearling weight [1, [3][4][5].
For at least two decades, studies have sought to identify quantitative trait loci (QTL) influencing bovine growth, body weight, and aspects of stature, including both linkage and modern genome-wide association analyses (GWAA); thereby underscoring the longstanding economic importance of efficient beef cattle production worldwide [6][7][8][9][10][11][12]. Moreover, QTL studies and modern genomic selection programs for economically important traits have been directly enabled by the generation of the bovine genome assembly, development of the Illumina Bovine SNP50 and 778K SNP arrays, and more recently, the demonstrated ability to accurately impute high-density genotypes, thereby enabling high-resolution analyses without the increased costs associated with direct genotyping [13][14][15][16][17][18][19][20]. Notably, several recent studies have established moderate heritability estimates for birth weight, weaning weight, and yearling weight in U.S. Gelbvieh, Angus, Limousin, Simmental, Hereford, and Red Angus beef cattle [20][21][22][23][24][25]. These studies also produce evidence for several relevant QTL and positional candidate genes; including orthologous genes LCORL and PLAG1 that affect both human and bovine height as well as pleiotropic QTL influencing feed efficiency, growth traits, and carcass traits across multiple U.S. beef breeds [6,10,12,20,[26][27][28][29][30][31]. However, the movement of germplasm (animals, semen, and embryos) across the U.S. in conjunction with the lack of tools to select for resilience to abiotic and biotic stressors has likely led to the loss of local adaptation in beef cattle [32]. Understanding genotype-by-environment interactions will allow us to identify the genes and biological processes involved in local adaptation. Genotype-by-environment (GxE) GWAA have been used alongside GWAA with the intent of identifying GxE interactions with complex traits [20,[33][34][35]. GxE GWAA are important to the beef industry as they identify individual ecoregions that could benefit from genomic selection [20,33,34].
The objective of this study was to identify loci with direct and genotype-by-environment effects on growth traits. Herein, we used 15,815 geographically diverse U.S. Red Angus beef cattle in conjunction with a union set of 836,118 (836K) imputed SNP variants to conduct GWAA and produce marker-based heritability estimates for birth weight, weaning weight, and yearling weight. Additionally, using thirty-year climate data and K-means clustering to assign all Red Angus beef cattle to discrete U.S. climate ecoregions, we estimated the significance of GxE interactions for birth weight, weaning weight, and yearling weight [32]. The present study represents the largest, high-density, single breed report to date that includes both standard GWAA and GxE GWAA for birth weight, weaning weight, and yearling weight; which was facilitated by an industry-supported research framework that includes accurate imputation to high-density genotypes for large-sample analyses [14,19,20]. The results of this study are expected to aid existing beef breeding programs and production systems by identifying QTL that may be included in future genotyping assays and genomic selection programs.

Heritability estimates for growth traits in U.S. Red Angus beef cattle
Marker-based heritability estimates (i.e., chip heritability) were produced for birth weight, weaning weight, and yearling weight using standardized relatedness matrices (G S ) with variance component analyses. Collectively, moderate to high heritability estimates with small standard errors (SE) were estimated for birth weight (0.51 ± 0.01), weaning weight (0.25 ± 0.01), and yearling weight (0.42 ± 0.01), respectively (Table 1). Moreover, these moderate to high heritability estimates for birth weight and weaning weight are similar to those produced by another study conducted on Red Angus cattle (0.58 ± 0.01 and 0.29 ± 0.01, respectively) [36]. Likewise, genetic correlations between traits were also high (birth weight and weaning weight = 0.54 ± 0.01; birth weight and yearling weight = 0.50 ± 0.01; weaning weight and yearling weight = 0.84 ± 0.01) (See Additional File 1). The results of our 836K single-marker GWAA for birth weight are presented in Fig. 1; with detailed summary data for 19 QTL which met a nominal significance threshold for polygenic traits (P ≤ 1e-05) described in Table 2 (Additional File 1) [71]. A comparison of birth weight QTL detected for U.S. Red Angus, Simmental, and Gelbvieh beef cattle as well as Holstein Jersey crossbred dairy cattle, revealed overlapping signals on BTA6, BTA14, and BTA20, suggesting that these birth weight QTL are not breed-specific, but rather, are likely to be more generally involved in bovine species growth processes (  Table 2 [ . Notably, all but two lead SNPs (i.e., 6_37 Mb, 6_35 Mb) were located in noncoding regions, which is concordant with recent studies of feed efficiency and growth traits in beef cattle (Table S1; Additional File 1, Additional File 2) [20,31]. Additionally, a QTL was detected on BTA6 (42 Mb), but with less statistical support, and included the positional candidate genes LOC782172 and ADGRA3; which have previously been associated with U.S. Gelbvieh growth traits (Table S2; Additional File 2) [20]. The genomic inflation factor for P-value estimates obtained from the birth weight GWAA are presented in Table S3 (Additional File 2). Single-marker GWAA (836K) for weaning weight in U.S. Red Angus beef cattle produced evidence for 14 QTL (P ≤ 1e-05), as defined by their lead SNPs (Table 3, Fig. 2; Additional File 1). Similar to a recent analysis of U.S. Gelbvieh beef cattle [20], the weaning weight QTL regions detected for U.S. Red Angus cattle suggest extensive pleiotropy with birth weight, as would be expected due to high genetic correlations between the two traits [4]. This includes the shared positional candidate genes on BTA6 (LCORL, LOC782905, HERC6, CCSER1, SLIT2, GRID2), BTA14 (LOC112449660), and BTA20 (LOC104975192, STC2, SH3PXD2B) ( Table 2, Table 3, Table S1; Additional File 2). Additional positional candidate genes identified for weaning weight QTL include those associated with growth and development (FAM184B, LOC112447052, NSG2, LOC112449630, MOS, PENK, MIR3660, CETN3) (Table 3) [73][74][75][76][77]. All weaning weight QTL detected by GWAA were located in noncoding regions. An additional pleiotropic QTL was noted on BTA6 (42 Mb), but with less statistical support, which is the same QTL detected in the birth weight GWAA (Table S2, Table S4; Additional File 2). The genomic inflation factor for P-value estimates obtained from the weaning weight GWAA are displayed in Table S3 (Additional File 2).   (Table 3) and yearling weight (Table 4) also revealed evidence for pleiotropic QTL influencing these traits via shared positional candidate genes on BTA6 and BTA14, including FAM184B and PENK, respectively. Positional candidate genes for QTL on BTA7, BTA20, and BTA21 which were only detected for yearling weight have been associated with general growth and development in Xenopus laevis (KCNIP1) [78], as well as bovine milk production (LOC112447488)      Fig. 3, Table S1; Additional File 2) [46]. Collectively, two of the 16 lead SNPs (20_05 Mb, STC2; 6_35 Mb, MMRN1) noted for yearling weight QTL were located within coding regions (Table 4). Interestingly, STC2 has previously been associated with body size, feed efficiency, and growth in cattle [20,24,31,40,42,47]; whereas MMRN1 has been associated with growth, feed efficiency, and metabolic stability during weather stress in cattle (Table 4, Table S1; Additional File 1, Additional File 2) [24,43,[56][57][58]. Despite less statistical support overall, the QTL on BTA6 at 42 Mb (i.e., LOC782172 and ADGRA3) was detected for birth weight, weaning weight, and yearling weight in the Red Angus GWAA (Table S2, Table S4, Table S5; Additional File 2); as was the QTL on BTA6 at 32 Mb (Table 2, Table 3, Table S5; Additional File 2), and the QTL at BTA7 at 91 Mb (  Additional File 1. The genomic inflation factor for yearling weight GWAA is reported in Table S3 (Additional File 2). Genomic inflation factors (λ) larger than 1 are expected for well-powered studies of polygenic traits [79,80], reflecting the large number of genomic loci influencing variation in these traits.

GxE GWAA for birth weight, weaning weight, and yearling weight in U.S. Red Angus beef cattle
To investigate GxE interactions in relation to birth weight, weaning weight, and yearling weight in U.S Red Angus beef cattle, we conducted additional single-marker (836K) analyses. All analyses included a variable for U.S. geographic ecoregion of origin, which was generated via K-means clustering using thirty-year climate data and treated as an interaction term, as previously described [20,32,72]. GxE GWAA for birth weight produced evidence for three interactions on BTA26 and BTA22 interacting with two ecoregions (Table 5, Fig. 4; Additional File 1). Positional candidate genes identified by GxE interactions for birth weight have been previously associated with cattle feed efficiency (PRKG1, LOC531679, SEC61G, and NEK10) (Table S1; Additional File 2) [81][82][83][84][85][86][87]. Additionally, PRKG1 is involved in vasodilation (Table 5) [82]. Notably, only one interaction detected by GxE GWAA for birth weight was identified as a coding variant (Additional File 1). More specifically, the lead SNP within the positional candidate gene NEK10 encodes a nonsynonymous mutation in exon 2 (Ser → Thr). Four additional interactions were also noted with less statistical support, as described in Table S6 (Additional File 2). Genomic inflation factors for P-value estimates obtained from GxE GWAA for birth weight are presented in Table S7 (Additional File 2). GxE GWAA for weaning weight in U.S. Red Angus beef cattle produced evidence for six significant interactions; thereby implicating positional candidate genes related to growth and development (DNAJC12), milk production (LOC112447568, TRNAE-UUC, LOC112447164, COX18), carcass traits (LOC112447496, LOC112447497, LOC782092), cellular proliferation and metabolism (SIRT1), and feed efficiency (LCLAT1), as defined by relevant lead SNPs (Table 6, Table S1, Fig. 5; Additional File 2) . Additionally, positional candidate genes identified on BTA28 (DNAJC12 and SIRT1) have been associated with bovine maturity rate, milk production, and meat quality traits [46,48,91]; as well as promoting cellular proliferation and regulation in humans and mice [90,92]. Interestingly, interactions on BTA7 (101 Mb) and BTA6 (88 Mb) revealed previous associations with aspects of bovine heat stress and thermotolerance, and both showed significant GxE interactions in the U.S. Desert Ecoregion (Table 6; Additional File 1) [89,104,105]. All interactions identified in the GxE GWAA for weaning weight in Red Angus cattle were located in noncoding regions (Table 6; Additional File 1). Sixteen additional interactions with less overall statistical support are noted in Table S8, with only one lead SNP encoding a nonsynonymous change within the positional candidate gene ANKK1 (Table S8; Additional File 1, Additional  Table 4 File 2). Genomic inflation factors for P-value estimates obtained from a GxE GWAA for weaning weight are summarized in Table S7 (Additional File 2).

Conclusions
Herein, we present evidence for pleiotropic QTL resulting from traditional GWAA influencing birth weight, weaning weight, and yearling weight in U.S. Red Angus beef cattle, and further confirm the involvement of genomic regions on BTA6 from 34 to 41 Mb and BTA14 from 23 to 25 Mb in various aspects of bovine growth, feed efficiency, carcass traits, and stature across breeds. Additionally, the results from our GWAA and GxE GWAA for birth weight, weaning weight, and yearling     weaning weight and yearling weight. Similar to previous GWAA and GxE GWAA on U.S. Gelbvieh and Simmental beef cattle, significant GxE associations detected for birth weight, weaning weight, and yearling weight in U.S. Red Angus cattle were not overlapping; thereby suggesting that although the majority of the main effect QTL were conserved between breeds, GxE interactions were not conserved. In agreement with previous GxE and local adaptation results in beef cattle, we find GxE effects associated with vasodilation and neural development. Identification of pleiotropic growth QTL and breed specific GxE interactions may potentially serve to benefit beef breeding programs across diverse U.S. climates via creation of region-specific genomic predictions. Moreover, the results of this study further demonstrate that imputation to a union set of high-density SNPs (i.e., 836K) can directly facilitate future studies at a fraction of the cost associated with direct genotyping; thus providing a research framework that directly enables large-scale analyses for economically important livestock species, and the potential for identifying causal variants via genome sequence-level imputed genotypes.    [129] package and the RStoolbox package [127,130] in R assigned every four kilometer (km) square of the continental U.S. to one of 9 clusters, denoted as ecoregions. These designated ecoregions consist of the Upper Midwest & Northeast, Fescue Belt, Rainforest, Forested Mountains, High Plains, Foothills, Desert, Southeast, and Arid Prairie. Animals were assigned to ecoregions by breeder zip-code as recorded in the U.S. Red Angus Association of America herdbook [32]. If the breeder's zip-code overlapped with two or more ecoregions, the animal was filtered from further analysis. Genotypes from 22,932 U.S. Red Angus cattle were provided by Neogen GeneSeek (Lincoln, NE, U.S.A). The ARS-UCD1.2 Bos taurus assembly [131] was used for SNP positions. The genotypes underwent filtering using PLINK 1.9 to remove individuals with call rates < 0.90 on an assay-by-assay basis (i.e., GeneSeek GGP-LDv3, GeneSeek GGP-LDv4, GeneSeek GGP-90KT, GeneSeek GGP-HDv3, GeneSeek Bovine-GGP-F250, Illumina Bovine SNP50, and Illumina HD 778K), removal of variants with call rates < 0.90 and Hardy-Weinberg Equilibrium (HWE) P-values < 1e-20 to exclude poorly genotyped loci [132]. Only autosomal chromosomes were utilized in these analyses. The remaining 22,457 individuals and associated genotypes were then merged and phased using PLINK and Eag-leV2.4, respectively [133]. Phased haplotypes for 8622 diverse individuals genotyped using the Illumina HD (778K SNPs; Illumina, San Diego, CA) and 28,114 individuals genotyped using the Bovine-GGP-F250 (250K SNPs; GeneSeek, Lincoln, NE) were used as a multibreed reference panel for imputation in minimac4 as previously described [19,134]. The 22,457 Red Angus genotypes from various assays were imputed for all markers contained on the two high-density research chips in this multi-breed reference panel. A total of 6642 cattle had only genotype information, thus providing 15,815 individuals with 836,118 markers each (ARS-UCD1.2) to be utilized as the final dataset for GWAA and GxE GWAA. Minimac4 reported imputed dosage genotypes to account for any potential uncertainty during imputation processes, as previously described [19,134].
Imputed genotypes (836K markers) and the adjusted phenotypes for Red Angus cattle were used to conduct univariate linear mixed model GWAA for birth weight (15,815 individuals), weaning weight (15,620 individuals), and yearling weight (12,388 individuals) using the program GEMMA. Prior to the execution of all GWAA, GEMMA filtered all SNP loci as follows: MAF (< 0.01 excluded), polymorphism (monomorphic SNPs excluded), and Hardy-Weinberg Equilibrium (HWE; P-values < 0.001 excluded), thereby producing genotypic sets of 675,115 SNPs for birth weight, 675,060 SNPs for weaning weight, and 674,493 SNPs for yearling weight. Genomic relationship matrices (G s ) were computed with the imputed genotypes in GEMMA to control for dependence between samples due to relatedness. The linear mixed models implemented in GEMMA also estimate the proportion of variance explained (PVE) by the genomic relationship matrix. The PVE is also referred to as "chip heritability" [135]. The univariate linear mixed model implemented for GWAA can be generally specified as: y = Wα + xβ + u + ϵ; where y is a n-vector of quantitative traits (i.e., birth weight, weaning weight, and yearling weight) for n-Red Angus individuals, W is an n x c matrix of specified covariates (i.e., fixed effects) including a column of 1s, α is a c-vector of corresponding coefficients including the intercept, x is an n-vector of SNP genotypes, β is the effect size of the SNP, u is an n-vector of random effects, and ϵ represents an n-vector of errors [20,135]. Additionally, u ∼ MVN n (0, λτ −1 Κ) and ϵ ∼ MVN n (0, τ −1 I), where MVN denotes multivariate normal distribution, τ −1 is the variance of the residual errors, λ is the ratio between the two variance components, Κ is the n x n genomic relatedness matrix, and I represents an n x n identity matrix [20,135]. Specifically, GEMMA performed a Wald test using -lmm 1 as follows: F Wald =β 2 Vβ , tests the alternative hypothesis ( H 1 :β � = 0 for each SNP against the null hypothesis for each SNP ( H 0 :β = 0 ).
Moreover, β = x T P c ˆ r x −1 x T P c ˆ r y is the estimate for β obtained using the restricted maximum likelihood (REML) estimate ˆ r in the alternative model; and the variance for β [20,135]. Under the null hypothesis, the Wald test statistics (F Wald ) come from an F(1, n − c − 1) distribution [20,135]; with GEMMA producing marker-based REML estimates and corresponding P-values. For all GxE GWAAs, discrete geographic ecoregion (i.e., the environmental variable) was specified as an interaction term using the -gxe command. GxE GWAA were computed using a mixed model which can be generally specified as: y = Wα + x snp β snp + x env β env + x snp × x env β snp × env + u + ϵ; where y is an n-vector of quantitative traits (i.e., birth weight, weaning weight, and yearling weight) for n-Red Angus individuals, W represents an n x c matrix of specified covariates, α is a c-vector of corresponding coefficients including the intercept, x snp represents an n-vector of SNP genotypes, β snp is the effect size of the SNP, x env is an n-vector of membership in a single ecoregion, β env represents the fixed effect of the ecoregion, β snp × env is the estimated interaction between SNP genotype and ecoregion, u is an n-vector of random effects, and ϵ is an n-vector of errors [20,135]. As above, u ∼ MVN n (0, λτ −1 Κ) and ϵ ∼ MVN n (0, τ −1 I). Each discrete ecoregion was compared against the remaining U.S. dataset using binary (0, 1) coding as an environmental variable with one exception; the Rainforest ecoregion had insufficient sample size for GxE GWAA, and thus eight separate GxE GWAA were computed (Additional File 1). GEMMA evaluated the alternative hypothesis for each interaction (H 1 : β snp × env ≠ 0) in comparison to the null hypothesis (H 0 : β snp × env = 0) using linear mixed models while controlling for population stratification, SNP main effect, and environmental effect while examining the interaction effect of each ecoregion [20,135]. Single-marker P-value results produced by GEMMA using the -lmm 1 and -gxe commands were further adjusted using chi-squared test statistics divided by a constant for additional genomic control [37,136]. Adjusted P-value results were utilized to produce Manhattan plots using the manhattan command in R [137]. All SNPs meeting the nominal significance threshold (P ≤ 1e-05) were rounded to the nearest Mb and strongly supported QTL were defined by ≥ 5 significant SNP loci with MAF ≥ 0.01 (i.e., a lead SNP plus four or more additional supporting SNPs within the same rounded Mb) [31,71]. Additional QTL were also noted with less overall statistical support in Tables S2, S4-S6, and S8-S9; thereby representing QTL defined by ≥ 2 but ≤ 4 SNP loci which met the nominal significance threshold (P ≤ 1e-05) within the same rounded Mb (Additional File 2). Positional candidate genes were implicated by location of the lead SNP. Genomic inflation factors (λ) were estimated using observed and expected P-values via regression for all GWAA and GxE GWAA in R [80,138,139]. The proportion of variance explained (PVE) by bovine SNPs was estimated as previously described [140]. Genetic correlations were estimated using the multivariate approach implemented in GEMMA [135,141], as previously described [142].
Additional file 1: Description and acronym definitions for summary data. Summary data for all genome-wide association analyses (GWAA) and genotype-by-environment genome-wide association analyses (GxE GWAA) performed for bovine birth weight, weaning weight, and yearling weight in tabular format.
Additional file 2: Table S1. Summary of QTL supporting pleiotropy detected for birth weight, weaning weight, and yearling weight GWAA and GxE GWAA in U.S. Red Angus cattle. Table S2. Summary of QTL with 2 to 4 supporting SNPs detected for birth weight in U.S. Red Angus cattle. Table S3. Genomic inflation factors (λ) calculated using observed P-values and expected P-values for GWAA for growth traits in U.S. Red Angus beef cattle. Table S4. Summary of QTL with 2 to 4 supporting SNPs detected for weaning weight in U.S. Red Angus cattle. Table S5. Summary of QTL with 2 to 4 supporting SNPs detected for yearling weight in U.S. Red Angus cattle. Table S6. Summary of GxE interactions with 2 to 4 supporting SNPs detected for birth weight in U.S. Red Angus cattle. Table S7. Genomic inflation factors (λ) calculated using observed P-values and expected P-values for GxE GWAA for growth traits in U.S. Red Angus beef cattle. Table S8. Summary of GxE interactions with 2 to 4 supporting SNPs detected for weaning weight in U.S. Red Angus cattle. Table S9. Summary of GxE interactions with 2 to 4 supporting SNPs detected for yearling weight in U.S. Red Angus cattle.